Transactions on Big Data
Authors
Abstract
Java 8 has introduced new capabilities such as lambda expressions and streams which simplify data-parallel computing. However, as a base language for Big Data systems, it still lacks a number of important capabilities such as processing very large datasets and distributing the computation over multiple machines. This paper gives an overview of the Java 8 Streams API and proposes extensions to allow its use in Big Data systems. It also shows how the API can be used to implement a range of standard Big Data paradigms. Finally, it compares performance with that of Hadoop and Spark. Despite being a proof-of-concept implementation, results indicate that it is a lightweight and efficient framework, comparable in performance to Hadoop and Spark, and is up to 5 times faster for the largest input sizes tested.
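As a minimal sketch of the lambda and stream style the abstract refers to, the following word count uses only the standard Java 8 Streams API on a single machine. The class name `WordCountSketch`, the method `countWords`, and the sample input are illustrative assumptions. This is not the paper's implementation and it does not include the proposed distributed extensions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; not taken from the paper.
public class WordCountSketch {

    // Counts word occurrences across all lines using a parallel stream.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.parallelStream()
                // Split each line on whitespace and flatten into one stream of words.
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // Drop the empty token produced by blank lines.
                .filter(word -> !word.isEmpty())
                // Normalize case so "Data" and "data" are counted together.
                .map(String::toLowerCase)
                // Group identical words and count each group.
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        // Assumed sample input, for illustration only.
        List<String> lines = Arrays.asList("big data big streams", "Data streams in Java");
        System.out.println(countWords(lines));
    }
}
```

The same pipeline runs sequentially if `parallelStream()` is replaced with `stream()`. In both cases the data must fit in the memory of one JVM, which is the limitation the abstract says the proposed extensions address.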
Similar resources
The Values Challenge for Big Data retrieval
As Big Data and analytics defined on top of Big Data have increasingly greater impacts on society, we humans are becoming incorporated in a Big Data loop: our activities, transactions, posts, and images, are all being recorded as Big Data; and in turn the analysis of Big Data is being used to make decisions that affect us. This paper explores characteristics of this grand loop of Big Data and b...
Big Data in Land Records Management in Kenya: A Fit and Viability Analysis
Big data is data whose size is beyond the ability of commonly used software tools to capture, manage, and process within tolerable time. The concept of big data has been necessitated by the growing capacity of the available information systems to facilitate the capture, processing, storage and use of large volumes of variable but credible data fast enough to generate optimum value for the users...
Chapter 1. Key Technologies for Big Data Stream Computing
1.1 Introduction Big data computing is a new trend for future computing with the quantity of data growing and the speed of data increasing. In general, there are two main mechanisms for big data computing, i.e., big data stream computing and big data batch computing. Big data stream computing is a model of straight through computing, such as Storm [1] and S4 [2] which do for stream computing wh...
Investigating the Effects of Large Block Transactions and Ownership Nature on Non-Financial Disclosure
Having adequate, sufficient and timely information and data is very important for investors' decision making. Processing information and allocating the asset are two fundamental tasks in the securities market and the stock price is more likely to have the nature of disclosure, information effectiveness and asset allocation efficiency. Thus, in this research, the effects of large block transacti...
Managing the Synchronization in the Lambda Architecture for Optimized Big Data Analysis
In a world of continuously expanding amounts of data, retrieving interesting information from enormous data sets becomes more complex every day. Solutions for precomputing views on these big data sets mostly follow either an offline approach, which is slow but can take into account the entire data set, or a streaming approach, which is fast but only relies on the latest data entries. A hybrid s...
A Data Analytic Algorithm for Managing, Querying, and Processing Uncertain Big Data in Cloud Environments
Big data are everywhere as high volumes of varieties of valuable precise and uncertain data can be easily collected or generated at high velocity in various real-life applications. Embedded in these big data are rich sets of useful information and knowledge. To mine these big data and to discover useful information and knowledge, we present a data analytic algorithm in this article. Our algorit...
Journal title:
Volume, issue:
Pages: -
Publication date: 2017